Vector and Scalar Projections
Module 02: Computational Linear Algebra-1
The University of Alabama
2026-03-23
Data lives in vector spaces, and learning depends on distance, similarity, and direction.
Geometry gives meaning to the data: how far apart are two points? How similar are two directions?
These questions lead to norms and inner products.

A norm \(\|\cdot\|\) assigns a length \(\|x\|\) to every vector \(x\). Common choices:

Euclidean (\(\ell_2\)): \(\|x\|_2 = \sqrt{\sum_i x_i^2}\) – most common in ML
Manhattan (\(\ell_1\)): \(\|x\|_1 = \sum_i |x_i|\) – used in regularization
Supremum (\(\ell_\infty\)): \(\|x\|_\infty = \max_i |x_i|\) – largest component
Visualizing the ‘unit ball’ (the set of points at distance 1 from the origin) for different norms.
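These norms can be computed directly with NumPy's `np.linalg.norm`; a quick sketch, where the `ord` argument selects which norm to use:

```python
import numpy as np

x = np.array([3.0, -4.0])

# The `ord` argument selects which norm to compute
l2 = np.linalg.norm(x)                # Euclidean: sqrt(9 + 16) = 5
l1 = np.linalg.norm(x, ord=1)         # Manhattan: |3| + |-4| = 7
linf = np.linalg.norm(x, ord=np.inf)  # Supremum: max(|3|, |-4|) = 4

print(l2, l1, linf)  # 5.0 7.0 4.0
```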

A unit vector has length 1:
\[\|\hat{x}\| = 1\]
Normalization converts any vector to a unit vector:
\[\hat{x} = \frac{x}{\|x\|}\]
Why normalize?
Example:
\[x = \begin{bmatrix} 3 \\ 4 \end{bmatrix}, \quad \|x\| = 5\]
\[\hat{x} = \frac{1}{5}\begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 0.6 \\ 0.8 \end{bmatrix}\]
Check: \(\|\hat{x}\| = \sqrt{0.6^2 + 0.8^2} = 1\) ✓
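A quick NumPy check of the same normalization, using \(x = (3, 4)\) from the example:

```python
import numpy as np

x = np.array([3.0, 4.0])
x_hat = x / np.linalg.norm(x)   # divide by the Euclidean length, here 5

print(x_hat)                                  # [0.6 0.8]
print(np.isclose(np.linalg.norm(x_hat), 1.0)) # True: unit length
```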
Given a norm, define distance:
\[d(x,y) = \|x - y\|\]

Properties:
Distance = length of displacement
Let \(\mathbf{x} = \begin{bmatrix} 3 \\ -4 \end{bmatrix}\). Calculate \(\|\mathbf{x}\|_1\), \(\|\mathbf{x}\|_2\), and \(\|\mathbf{x}\|_\infty\).
An inner product is a function \(\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}\) satisfying:
Definition properties:
Linearity: \(\langle x + z, y \rangle = \langle x,y \rangle + \langle z,y \rangle\) and \(\langle \alpha x , y \rangle = \alpha\langle x,y \rangle = \langle x,\alpha y \rangle\)
Symmetry: \(\langle x,y \rangle = \langle y,x \rangle\)
Positive definiteness: \(\langle x,x \rangle \ge 0\), and \(\langle x,x \rangle = 0 \iff x = 0\)
The Standard Dot Product:
\[\langle x, y \rangle = \sum_i x_i y_i = \|x\|\|y\|\cos(\theta)\]

The inner product of a vector with itself is the norm squared!
Inner products naturally give us a way to measure length. We call this the Induced Norm.
\[ \|x\| = \sqrt{\langle x, x \rangle} \]
Example: Using the standard dot product \(\langle x, y \rangle = \sum x_i y_i\):
\[ \|x\| = \sqrt{x \cdot x} = \sqrt{\sum x_i^2} \]
This recovers the Euclidean Norm!
The angle \(\theta\) between two vectors is defined via:
\[\cos(\theta) = \frac{\langle x,y \rangle}{\|x\|\,\|y\|}\]

In high-dimensional spaces (like text analysis), we often care about direction, not magnitude.
\[\text{Cosine Similarity} = \frac{\langle x, y \rangle}{\|x\| \|y\|}\]
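A minimal NumPy sketch of this formula (the helper name `cosine_similarity` is just for illustration):

```python
import numpy as np

def cosine_similarity(x, y):
    # <x, y> / (||x|| ||y||): depends only on direction, not magnitude
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

x = np.array([1.0, 0.0])
y = np.array([10.0, 0.0])   # same direction, 10x the length
z = np.array([0.0, 2.0])    # perpendicular direction

print(cosine_similarity(x, y))  # 1.0
print(cosine_similarity(x, z))  # 0.0
```

Note that scaling a vector leaves its cosine similarity unchanged, which is exactly why it is preferred when magnitude is irrelevant.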

Consider \(\mathbf{u} = \begin{bmatrix} 1 \\ 5 \end{bmatrix}\) and \(\mathbf{v} = \begin{bmatrix} 5 \\ -1 \end{bmatrix}\). Compute \(\langle \mathbf{u}, \mathbf{v} \rangle\) and the angle between them.
Two vectors are orthogonal if:
\[\langle x,y \rangle = 0\]
Interpretation:

Scalar projection (length of projection):
\[\text{comp}_y(x) = \|x\|\cos(\theta) = \frac{\langle x,y \rangle}{\|y\|}\]
Vector projection (actual vector):
\[\mathrm{proj}_y(x) = \frac{\langle x,y \rangle}{\|y\|^2} y = \frac{\langle x,y \rangle}{\langle y,y \rangle} y\]
Used in:
Vector and Scalar Projections
Project vector \(\mathbf{b} = \begin{bmatrix} 2 \\ 4 \\ 1 \end{bmatrix}\) onto \(\mathbf{a} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}\).
1. Scalar Projection (\(\frac{\mathbf{b} \cdot \mathbf{a}}{\|\mathbf{a}\|}\)):
Answer: \(3\sqrt{2} \approx 4.24\) (\(\mathbf{b}\cdot\mathbf{a}=6\), \(\|\mathbf{a}\|=\sqrt{2}\))
2. Vector Projection (\(\text{proj}_\mathbf{a} \mathbf{b}\)):
\(\begin{bmatrix} 3 \\ 3 \\ 0 \end{bmatrix}\)
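The same worked example can be checked in NumPy (`proj` is an illustrative helper implementing the projection formula above):

```python
import numpy as np

def proj(b, a):
    # Vector projection of b onto a: (<b,a> / <a,a>) * a
    return (np.dot(b, a) / np.dot(a, a)) * a

b = np.array([2.0, 4.0, 1.0])
a = np.array([1.0, 1.0, 0.0])

comp = np.dot(b, a) / np.linalg.norm(a)  # scalar projection: 6 / sqrt(2)
print(comp)        # ~4.2426 (= 3*sqrt(2))
print(proj(b, a))  # [3. 3. 0.]
```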
A basis \(\{v_i\}\) is orthonormal if every vector has unit length and all are mutually orthogonal.
Why use orthonormal bases?
Is the set \(S\) an orthonormal basis for \(\mathbb{R}^2\)?
\(S = \left\{ \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}, \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix} \right\}\)
Check Conditions:
Conclusion: Yes, it is an orthonormal basis!
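One way to run this check numerically: stack the candidate vectors as the columns of a matrix \(S\); the set is orthonormal exactly when \(S^T S = I\). A NumPy sketch:

```python
import numpy as np

# Candidate basis vectors as the columns of S
S = np.array([[1/np.sqrt(2),  1/np.sqrt(2)],
              [1/np.sqrt(2), -1/np.sqrt(2)]])

# S^T S collects every condition at once: diagonal entries are the
# squared lengths, off-diagonal entries are the pairwise inner products
print(np.allclose(S.T @ S, np.eye(2)))  # True -> orthonormal
```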
Watch how different transformations change the area of the unit square!
The determinant measures how much a transformation scales area.
For a 2×2 matrix: \[\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc\]
Geometric Meaning:
Quick Examples:
| Matrix | det | Area Factor |
|---|---|---|
| \(\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}\) | 6 | 6× bigger |
| \(\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\) | 1 | Same area |
| \(\begin{pmatrix} 2 & 4 \\ 1 & 2 \end{pmatrix}\) | 0 | Collapsed! |
The determinant tells you how the transformation scales space!
det > 0
✅ Orientation Preserved
The “handedness” stays the same (counterclockwise stays counterclockwise)
det < 0
🔄 Orientation Flipped
Like looking in a mirror — left becomes right
det = 0
💀 Space Collapsed
2D → 1D line (or point). Matrix is singular — no inverse!
det = 0 means the transformation loses information — you can’t undo it!
import numpy as np

# Define matrices
A = np.array([[2, 0],
              [0, 3]])   # Scaling matrix
B = np.array([[1, 2],
              [3, 4]])   # General matrix
C = np.array([[2, 4],
              [1, 2]])   # Singular matrix (det = 0)

# Compute determinants
print(f"det(A) = {np.linalg.det(A):.2f}")  # Expected: 6
print(f"det(B) = {np.linalg.det(B):.2f}")  # Expected: -2
print(f"det(C) = {np.linalg.det(C):.2f}")  # Expected: 0

Output:
det(A) = 6.00
det(B) = -2.00
det(C) = 0.00
Use np.linalg.det(A) to compute the determinant of any matrix!
Definition: The inverse \(A^{-1}\) “undoes” the transformation \(A\):
\[A A^{-1} = A^{-1} A = I\]
Geometric Intuition:
2×2 Inverse Formula:
\[A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \Rightarrow A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}\]
Key Insight: Swap \(a \leftrightarrow d\), negate \(b\) and \(c\), divide by determinant.
The inverse only exists when det(A) ≠ 0!
Given: \[A = \begin{pmatrix} 3 & 1 \\ 2 & 4 \end{pmatrix}\]
Step 1: Calculate determinant \[\det(A) = 3 \times 4 - 1 \times 2 = 12 - 2 = 10\]
Step 2: Apply the formula (swap, negate, divide) \[A^{-1} = \frac{1}{10} \begin{pmatrix} 4 & -1 \\ -2 & 3 \end{pmatrix} = \begin{pmatrix} 0.4 & -0.1 \\ -0.2 & 0.3 \end{pmatrix}\]
Step 3: Verify: \(A \cdot A^{-1} = I\)
Always verify your inverse by checking that \(A \cdot A^{-1} = I\)!
import numpy as np

# Define a matrix
A = np.array([[3, 1],
              [2, 4]])

# Compute inverse
A_inv = np.linalg.inv(A)
print("A^(-1) =\n", A_inv)

# Verify: A @ A_inv = I
print("A @ A^(-1) =\n", A @ A_inv)

Output:
A^(-1) =
 [[ 0.4 -0.1]
 [-0.2  0.3]]
A @ A^(-1) =
 [[1. 0.]
 [0. 1.]]
Use np.linalg.inv(A) — but only if you really need the inverse!
Different coordinate systems describe the same vector differently.
Example: A point at \((3, 2)\) in standard coordinates might be \((1, 1)\) in a rotated coordinate system!
Same point in space, different numbers to describe it!
If \(B\) contains the new basis vectors as its columns:
To convert TO new basis: \[[\mathbf{v}]_{\text{new}} = B^{-1} [\mathbf{v}]_{\text{standard}}\]
To convert FROM new basis: \[[\mathbf{v}]_{\text{standard}} = B [\mathbf{v}]_{\text{new}}\]
Key Insight: \(B^{-1}\) converts TO the new basis, \(B\) converts FROM it.
The inverse of a basis matrix converts coordinates between systems!
Problem: Convert \(\mathbf{v} = \begin{pmatrix} 4 \\ 3 \end{pmatrix}\) to a stretched basis.
New basis: \(\mathbf{b}_1 = \begin{pmatrix} 2 \\ 0 \end{pmatrix}\), \(\mathbf{b}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}\)
Step 1: Build basis matrix \(B = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}\)
Step 2: Find inverse \(B^{-1} = \begin{pmatrix} 0.5 & 0 \\ 0 & 1 \end{pmatrix}\)
Step 3: Convert: \([\mathbf{v}]_{\text{new}} = B^{-1} \mathbf{v} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}\)
The point (4,3) in standard coords = (2,3) in the stretched basis!
🧮 Practice Problem
Given: New basis vectors \(\mathbf{b}_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}\) and \(\mathbf{b}_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}\)
Convert: \(\mathbf{v} = \begin{pmatrix} 2 \\ 4 \end{pmatrix}\) to the new basis.
Solution:
\(B = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\), \(\det(B) = 2\)
\(B^{-1} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\)
\([\mathbf{v}]_{\text{new}} = B^{-1}\mathbf{v} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}\)
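A NumPy check of this practice problem; it uses `np.linalg.solve` rather than forming \(B^{-1}\) explicitly, which is the numerically preferred route:

```python
import numpy as np

B = np.array([[1.0, -1.0],
              [1.0,  1.0]])   # new basis vectors as columns
v = np.array([2.0, 4.0])

# Solving B w = v gives the coordinates of v in the new basis
v_new = np.linalg.solve(B, v)
print(v_new)      # [3. 1.]

# B converts back FROM the new basis to standard coordinates
print(B @ v_new)  # [2. 4.]
```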
Goal:
Convert any linearly independent set into an orthonormal basis
Algorithm: for each input vector \(u_k\), subtract its projections onto the previously constructed vectors, then normalize:
\[v_k = u_k - \sum_{j<k} \mathrm{proj}_{v_j}(u_k), \qquad e_k = \frac{v_k}{\|v_k\|}\]

Setup: Predicting House Price using two features:
The Problem: These are highly correlated! (Bigger house \(\to\) more rooms).
The Solution (Gram–Schmidt): Create a new feature \(x_2^\perp\): \[x_2^\perp = x_2 - \text{proj}_{x_1}(x_2)\] “Rooms that CANNOT be explained by Size”
Now \(x_1\) and \(x_2^\perp\) are orthogonal. The model becomes stable and interpretable.
Given input basis \(u_1,\dots,u_n\):
Convert \(u_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, u_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\) into an orthonormal basis.
\[v_1 = u_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\]
\[v_2 = u_2 - \frac{u_2 \cdot v_1}{v_1 \cdot v_1} v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} - \frac{1}{2} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.5 \\ -0.5 \end{bmatrix}\]
\[e_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}, \quad e_2 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix}\]
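The procedure can be sketched in NumPy (`gram_schmidt` is an illustrative helper that assumes the columns are linearly independent):

```python
import numpy as np

def gram_schmidt(U):
    # Orthonormalize the columns of U (assumes linear independence)
    basis = []
    for u in U.T:                     # iterate over columns
        v = u.copy()
        for e in basis:
            v = v - np.dot(u, e) * e  # subtract projection onto each e
        basis.append(v / np.linalg.norm(v))
    return np.column_stack(basis)

U = np.array([[1.0, 1.0],
              [1.0, 0.0]])           # columns u1, u2 from the example
E = gram_schmidt(U)
print(E)  # columns: (1/sqrt2, 1/sqrt2) and (1/sqrt2, -1/sqrt2)
```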
In practice, we use QR Decomposition (numerically stable):
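For example, `np.linalg.qr` returns a matrix \(Q\) whose columns are orthonormal (possibly with sign flips relative to hand-computed Gram–Schmidt) and an upper-triangular \(R\):

```python
import numpy as np

U = np.array([[1.0, 1.0],
              [1.0, 0.0]])

Q, R = np.linalg.qr(U)   # Q: orthonormal columns, R: upper triangular
print(np.allclose(Q.T @ Q, np.eye(2)))  # True
print(np.allclose(Q @ R, U))            # True: Q R reconstructs U
```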
Visualizing Decorrelation (Sparsity of Correlations):
Before (\(X^T X\))
High Values Everywhere
(Correlated)
⬛️ ⬛️ ⬛️
⬛️ ⬛️ ⬛️
⬛️ ⬛️ ⬛️
\(\to\)
After (\(Q^T Q\))
Diagonal / Sparse
(Identity Matrix)
🟩 ⬜️ ⬜️
⬜️ 🟩 ⬜️
⬜️ ⬜️ 🟩
If an input vector is linearly dependent on the previous ones, the subtraction step produces the zero vector: the procedure reveals dependence automatically!



